Trace Equivalence Characterization Through Reinforcement Learning

Authors

  • Josée Desharnais
  • François Laviolette
  • Krishna Priya Darsini Moturu
  • Sami Zhioua
Abstract

In the context of probabilistic verification, we provide a new notion of trace-equivalence divergence between pairs of Labelled Markov Processes. This divergence corresponds to the optimal value of a particular derived Markov Decision Process, and it can therefore be estimated by Reinforcement Learning methods. Moreover, we provide PAC guarantees on this estimation.
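
This page does not reproduce the paper's construction of the derived MDP, so the following is only a minimal sketch of the estimation idea in Python: tabular Q-learning is run on a small, hand-coded stand-in for the derived MDP, and the greedy value at the initial state is read off as the divergence estimate. The state space, transition probabilities, and rewards below are hypothetical placeholders, not the construction from the paper.

```python
import random
from collections import defaultdict

# Hypothetical stand-in for the MDP derived from two labelled Markov
# processes; in the paper, the MDP is built from the pair of processes so
# that its optimal value equals the trace-equivalence divergence.
# Format: state -> action -> [(probability, next_state, reward, done)].
TRANSITIONS = {
    "s0": {"a": [(0.7, "s1", 0.0, False), (0.3, "s2", 0.0, False)],
           "b": [(1.0, "s2", 0.0, False)]},
    "s1": {"a": [(1.0, "term", 1.0, True)],
           "b": [(1.0, "term", 0.0, True)]},
    "s2": {"a": [(1.0, "term", 0.0, True)],
           "b": [(0.5, "term", 1.0, True), (0.5, "term", 0.0, True)]},
}

def step(state, action):
    """Sample one transition of the derived MDP."""
    outcomes = TRANSITIONS[state][action]
    r, acc = random.random(), 0.0
    for prob, nxt, reward, done in outcomes:
        acc += prob
        if r <= acc:
            return nxt, reward, done
    return outcomes[-1][1:]  # guard against floating-point round-off

def q_learning(episodes=20000, alpha=0.1, epsilon=0.1):
    """Estimate the optimal (undiscounted) value of the initial state."""
    q = defaultdict(float)  # Q-values keyed by (state, action)
    for _ in range(episodes):
        state, done = "s0", False
        while not done:
            actions = list(TRANSITIONS[state])
            if random.random() < epsilon:    # epsilon-greedy exploration
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: q[(state, a)])
            nxt, reward, done = step(state, action)
            target = reward if done else reward + max(
                q[(nxt, a)] for a in TRANSITIONS[nxt])
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
    return max(q[("s0", a)] for a in TRANSITIONS["s0"])

print("estimated divergence:", q_learning())
```

With the placeholder dynamics above, the estimate converges to about 0.85; under the paper's actual construction, the value is zero exactly when the two processes are trace equivalent.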

Similar Resources

Testing Probabilistic Equivalence Through Reinforcement Learning

We propose a new approach to the verification of probabilistic processes for which the model may not be available. We use a technique from Reinforcement Learning to approximate how far apart two processes are by solving a Markov Decision Process. If the two processes are equivalent, the algorithm returns zero; otherwise it returns a number and a test that witness the non-equivalence. We sugges...

Testing Stochastic Processes through Reinforcement Learning

We propose a new approach to verification of probabilistic processes for which the model may not be available. We show how to use a technique from Reinforcement Learning to approximate how far apart two processes are by solving a Markov Decision Process. The key idea of the approach is to define the MDP out of the processes to be tested, in such a way that the optimal value is interpreted as a ...

Dynamic Obstacle Avoidance by Distributed Algorithm based on Reinforcement Learning (RESEARCH NOTE)

In this paper we focus on the application of reinforcement learning to obstacle avoidance in dynamic environments in wireless sensor networks. A distributed algorithm based on reinforcement learning is developed for sensor networks to guide a mobile robot through dynamic obstacles. The sensor network models the danger of the area under coverage as obstacles, and has the property of adoption o...

An Analysis of Actor/Critic Algorithms Using Eligibility Traces: Reinforcement Learning with Imperfect Value Function

We present an analysis of actor/critic algorithms in which the actor updates its policy using eligibility traces of the policy parameters. Most theoretical results for eligibility traces have concerned only the critic's value-iteration algorithms. This paper investigates what the actor's eligibility trace does. The results show that the algorithm is an extension of Williams' REINFORCE algori...
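
As a companion to this summary, here is a minimal actor/critic sketch in which both the critic's values and the actor's softmax-policy parameters carry eligibility traces. The chain environment, step sizes, and trace-decay constant are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N_STATES, LEFT, RIGHT = 5, 0, 1        # chain with terminal states 0 and 4
GAMMA, LAM, ALPHA_V, ALPHA_TH = 1.0, 0.9, 0.1, 0.1

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

theta = np.zeros((N_STATES, 2))        # actor: tabular softmax policy parameters
v = np.zeros(N_STATES)                 # critic: state-value estimates

for _ in range(2000):
    e_th = np.zeros_like(theta)        # actor eligibility trace
    e_v = np.zeros_like(v)             # critic eligibility trace
    s, done = 2, False                 # start in the middle of the chain
    while not done:
        pi = softmax(theta[s])
        a = rng.choice(2, p=pi)
        s2 = s - 1 if a == LEFT else s + 1
        done = s2 in (0, N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0   # reward for exiting right
        delta = r + (0.0 if done else GAMMA * v[s2]) - v[s]  # TD error
        e_v *= GAMMA * LAM             # critic: accumulating trace
        e_v[s] += 1.0
        v += ALPHA_V * delta * e_v
        grad = np.zeros_like(theta)    # actor: trace of grad log pi(a|s)
        grad[s] = -pi
        grad[s, a] += 1.0
        e_th = GAMMA * LAM * e_th + grad
        theta += ALPHA_TH * delta * e_th
        s = s2

print("P(right):", [round(softmax(theta[s])[RIGHT], 2) for s in (1, 2, 3)])
```

After training, the policy at every interior state should strongly prefer the rewarded right move, the actor's trace having assigned credit to actions taken several steps before the reward.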

Actual return reinforcement learning versus Temporal Differences: Some theoretical and experimental results

This paper argues that for many domains, we can expect credit-assignment methods that use actual returns to be more effective for reinforcement learning than the more commonly used temporal difference methods. We present analysis and empirical evidence from three sets of experiments in different domains to support this claim. A new algorithm we call C-Trace, a variant of the P-Trace RL algorithm i...
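
C-Trace and P-Trace themselves are not reproduced here; the sketch below only contrasts the two kinds of credit assignment the summary compares, an actual-return (Monte Carlo) update against a bootstrapped TD(0) update, on a five-state random walk with a known true value function. All constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N, ALPHA = 5, 0.1                      # 5 non-terminal states, step size
v_mc, v_td = np.zeros(N), np.zeros(N)  # Monte Carlo and TD(0) estimates

for _ in range(5000):
    s, path = 2, []                    # each episode starts in the middle
    while 0 <= s < N:
        path.append(s)
        s2 = s + rng.choice([-1, 1])   # unbiased random walk
        r = 1.0 if s2 == N else 0.0    # +1 only on exiting to the right
        # TD(0): move v(s) toward reward plus the successor's estimate
        nxt = v_td[s2] if 0 <= s2 < N else 0.0
        v_td[s] += ALPHA * (r + nxt - v_td[s])
        s = s2
    ret = r                            # undiscounted actual return of the episode
    # Monte Carlo: move every visited state toward the actual return
    for s in path:
        v_mc[s] += ALPHA * (ret - v_mc[s])

print("true values:", [round((i + 1) / (N + 1), 2) for i in range(N)])
print("MC estimate:", np.round(v_mc, 2))
print("TD estimate:", np.round(v_td, 2))
```

Both estimators approach the true values (i+1)/6; the question the paper studies is which family assigns credit more effectively in a given domain.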

Journal:

Volume   Issue

Pages  -

Publication date: 2006